Results 1 - 20 of 2,091
1.
EBioMedicine ; 101: 105027, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38418263

ABSTRACT

BACKGROUND: Cardiomyopathy is a clinically and genetically heterogeneous heart condition that can lead to heart failure and sudden cardiac death in childhood. While it has a strong genetic basis, the genetic aetiology for over 50% of cardiomyopathy cases remains unknown. METHODS: In this study, we analyse the characteristics of tandem repeats from genome sequence data of unrelated individuals diagnosed with cardiomyopathy from Canada and the United Kingdom (n = 1216) and compare them to those found in the general population. We perform burden analysis to identify genomic and epigenomic features that are impacted by rare tandem repeat expansions (TREs), and enrichment analysis to identify functional pathways that are involved in the TRE-associated genes in cardiomyopathy. We use Oxford Nanopore targeted long-read sequencing to validate repeat size and methylation status of one of the most recurrent TREs. We also compare the TRE-associated genes to those that are dysregulated in the heart tissues of individuals with cardiomyopathy. FINDINGS: We demonstrate that tandem repeats that are rarely expanded in the general population are predominantly expanded in cardiomyopathy. We find that rare TREs are disproportionately present in constrained genes near transcriptional start sites, have high GC content, and frequently overlap active enhancer H3K27ac marks, where expansion-related DNA methylation may reduce gene expression. We demonstrate the gene silencing effect of expanded CGG tandem repeats in DIP2B through promoter hypermethylation. We show that the enhancer-associated loci are found in genes that are highly expressed in human cardiomyocytes and are differentially expressed in the left ventricle of the heart in individuals with cardiomyopathy. INTERPRETATION: Our findings highlight the underrecognized contribution of rare tandem repeat expansions to the risk of cardiomyopathy and suggest that rare TREs contribute to ∼4% of cardiomyopathy risk. 
FUNDING: Government of Ontario (RKCY), The Canadian Institutes of Health Research PJT 175329 (RKCY), The Azrieli Foundation (RKCY), SickKids Catalyst Scholar in Genetics (RKCY), The University of Toronto McLaughlin Centre (RKCY, SM), Ted Rogers Centre for Heart Research (SM), Data Sciences Institute at the University of Toronto (SM), The Canadian Institutes of Health Research PJT 175034 (SM), The Canadian Institutes of Health Research ENP 161429 under the frame of ERA PerMed (SM, RL), Heart and Stroke Foundation of Ontario & Robert M Freedom Chair in Cardiovascular Science (SM), Bitove Family Professorship of Adult Congenital Heart Disease (EO), Canada Foundation for Innovation (SWS, JR), Canada Research Chair (PS), Genome Canada (PS, JR), The Canadian Institutes of Health Research (PS).


Subjects
Cardiomyopathies , Congenital Heart Defects , Humans , Adult , Congenital Heart Defects/genetics , Tandem Repeat Sequences/genetics , DNA Methylation , Cardiomyopathies/genetics , Ontario , Nerve Tissue Proteins/genetics
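
Note on the burden analysis mentioned in the abstract above: the abstract does not say which statistical test was used. A minimal sketch, assuming a one-sided Fisher's exact test on carrier counts (one common choice for comparing rare-variant carriers between cases and controls), is given here. The function name and the toy counts are hypothetical and are not taken from the study.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test on the 2x2 table [[a, b], [c, d]].

    Rows: cases, controls. Columns: carriers, non-carriers.
    Returns P(carriers among cases >= a) under the null of no association.
    """
    cases = a + b
    carriers = a + c
    total = a + b + c + d
    denom = comb(total, carriers)
    upper = min(cases, carriers)
    # Sum the hypergeometric probabilities of tables at least as extreme as observed
    tail = sum(
        comb(cases, k) * comb(total - cases, carriers - k)
        for k in range(a, upper + 1)
    )
    return tail / denom


# Toy counts, purely hypothetical: 3 of 4 cases and 1 of 4 controls carry a rare expansion
p_value = fisher_exact_greater(3, 1, 1, 3)
print(f"one-sided p = {p_value:.4f}")
```

A small p-value would indicate that carriers are over-represented among cases; a real analysis would also need to handle covariates and multiple testing, which this sketch ignores.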
2.
PLoS One ; 19(1): e0295595, 2024.
Article in English | MEDLINE | ID: mdl-38271341

ABSTRACT

Mitochondria are known to play an essential role in the cell. These organelles contain their own DNA, which is divided into a coding region and a non-coding region (NCR). While much of the NCR's function is unknown, tandem repeats have been observed in several vertebrates, with extreme intra-individual, intraspecific and interspecific variation. Taking advantage of a new complete reference for the mitochondrial genome of the Afro-European Barn Owl (Tyto alba), as well as 172 resequenced whole genomes, we (i) describe the reference mitochondrial genome with a special focus on the repeats in the NCR, (ii) quantify the variation in number of copies between individuals, and (iii) explore the possible factors associated with the variation in the number of repetitions. The reference mitochondrial genome revealed a long (256 bp) and a short (80 bp) tandem repeat in the NCR. The resequenced genomes showed great variation in the number of copies between individuals, with 4 to 38 copies of the long repeat and 6 to 135 copies of the short repeat. Among the factors associated with this variation between individuals, the tissue used for extraction was the most significant. The exact mechanisms by which these repeats form are still to be discovered, and understanding them will help explain the maintenance of the polymorphism in the number of copies, as well as their interactions with the metabolism, aging and health of individuals.


Subjects
Mitochondrial Genome , Strigiformes , Animals , Humans , DNA Copy Number Variations , Strigiformes/genetics , Base Sequence , Tandem Repeat Sequences/genetics
3.
Biochem Biophys Res Commun ; 692: 149349, 2024 Jan 15.
Article in English | MEDLINE | ID: mdl-38056160

ABSTRACT

While it is well established that a mere 2% of human DNA nucleotides are involved in protein coding, the remainder of the DNA plays a vital role in the preservation of normal cellular genetic function. A significant proportion of tandem repeats (TRs) are present in non-coding DNA. TRs are specific nucleotide sequences that consist of numerous repetitions of a given fragment. In this study, we employed our novel algorithm grounded in finite automata theory, which we refer to as Dafna, to investigate for the first time the likelihood of these nucleotide sequences forming non-canonical DNA structures (NS). Such structures include G-quadruplexes, i-motifs, hairpins, and triplexes. The tandem repeats under consideration in our research encompassed sequences containing 1 to 6 nucleotides per repeated fragment. For comparison, we employed a set of randomly generated sequences of the same length (60 nucleotides) as a benchmark. The outcomes of our research exposed a disparity between the potential for NS formation in random sequences and tandem repeats. Our findings affirm that the propensity of DNA and RNA to form NS is closely tied to various genetic disorders, including Huntington's disease, Fragile X syndrome, and Friedreich's ataxia. In the concluding discussion, we present a proposal for a new therapeutic mechanism to address these diseases. This novel approach revolves around the ability of specific nucleic acid fragments to form multiple types of NS.


Subjects
Clinical Relevance , Tandem Repeat Sequences , Humans , Tandem Repeat Sequences/genetics , DNA/chemistry , Base Sequence , Nucleotides
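
Note on the comparison described in the abstract above: the experimental setup (60-nucleotide tandem repeats with units of 1 to 6 nucleotides, compared against random sequences of the same length) can be sketched as follows. This is not the Dafna algorithm, whose details the abstract does not give. The regular expression is a widely used textbook pattern for putative G-quadruplexes and is an assumption here; all function names are hypothetical.

```python
import random
import re

SEQ_LEN = 60  # sequence length stated in the abstract

# Textbook pattern for a putative G-quadruplex: four G-runs of >= 3
# separated by loops of 1 to 7 nucleotides. An assumption, not Dafna.
G4_PATTERN = re.compile(r"(?:G{3,}[ACGT]{1,7}){3}G{3,}")


def tandem_repeat(unit, length=SEQ_LEN):
    """Repeat `unit` head-to-tail and trim the result to `length` nucleotides."""
    copies = length // len(unit) + 1
    return (unit * copies)[:length]


def random_sequence(length=SEQ_LEN, rng=None):
    """Uniformly random DNA sequence used as a benchmark."""
    rng = rng or random.Random()
    return "".join(rng.choice("ACGT") for _ in range(length))


def has_putative_g4(seq):
    """True if the sequence contains a match to the textbook G4 pattern."""
    return G4_PATTERN.search(seq) is not None


if __name__ == "__main__":
    rng = random.Random(0)
    for unit in ("GGGA", "GGGGCC", "CAG"):
        print(unit, has_putative_g4(tandem_repeat(unit)))
    hits = sum(has_putative_g4(random_sequence(rng=rng)) for _ in range(1000))
    print("random sequences with a putative G4:", hits, "of 1000")
```

A pattern match only flags a candidate; it says nothing about whether the structure actually forms, and it covers only one of the four structure types the abstract lists.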
4.
Transl Psychiatry ; 13(1): 402, 2023 Dec 20.
Article in English | MEDLINE | ID: mdl-38123544

ABSTRACT

Tandem repeats (TRs) are prevalent throughout the genome, constituting at least 3% of the genome, and are often highly polymorphic. The high mutation rate of TRs, which can be orders of magnitude higher than that of single-nucleotide polymorphisms and indels, indicates that they are likely to make significant contributions to phenotypic variation, yet their contribution to schizophrenia has been largely ignored by recent genome-wide association studies (GWAS). Tandem repeat expansions are already known causative factors for over 50 disorders, while common tandem repeat variation is increasingly being identified as significantly associated with complex disease and gene regulation. The current review summarizes key background concepts of tandem repeat variation as it pertains to disease risk, elucidating their potential for schizophrenia association. An overview of next-generation sequencing-based methods that may be applied for TR genome-wide identification is provided, and some key methodological challenges in TR analyses are delineated.


Subjects
Genome-Wide Association Study , Schizophrenia , Humans , Schizophrenia/genetics , Human Genome , Tandem Repeat Sequences/genetics , Single Nucleotide Polymorphism
5.
Sci Adv ; 9(47): eadj1261, 2023 Nov 24.
Article in English | MEDLINE | ID: mdl-37992162

ABSTRACT

The biological role of the repetitive DNA sequences in the human genome remains an outstanding question. Recent long-read human genome assemblies have allowed us to identify a function for one of these repetitive regions. We have uncovered a tandem array of conserved primate-specific retrogenes encoding the protein Elongin A3 (ELOA3), a homolog of the RNA polymerase II (RNAPII) elongation factor Elongin A (ELOA). Our genomic analysis shows that the ELOA3 gene cluster is conserved among primates and the number of ELOA3 gene repeats is variable in the human population and across primate species. Moreover, the gene cluster has undergone concerted evolution and homogenization within primates. Our biochemical studies show that ELOA3 functions as a promoter-associated RNAPII pause-release elongation factor with distinct biochemical and functional features from its ancestral homolog, ELOA. We propose that the ELOA3 gene cluster has evolved to fulfil a transcriptional regulatory function unique to the primate lineage that can be targeted to regulate cellular hyperproliferation.


Subjects
Peptide Elongation Factors , RNA Polymerase II , Animals , Humans , RNA Polymerase II/genetics , RNA Polymerase II/metabolism , Peptide Elongation Factors/genetics , Primates/genetics , Elongin/genetics , Multigene Family , Tandem Repeat Sequences/genetics
6.
Emerg Top Life Sci ; 7(3): 361-381, 2023 Dec 14.
Article in English | MEDLINE | ID: mdl-37905568

ABSTRACT

Long-read sequencing platforms provide unparalleled access to the structure and composition of all classes of tandemly repeated DNA from STRs to satellite arrays. This review summarizes our current understanding of their organization within the human genome, their importance with respect to disease, as well as the advances and challenges in understanding their genetic diversity and functional effects. Novel computational methods are being developed to visualize and associate these complex patterns of human variation with disease, expression, and epigenetic differences. We predict accurate characterization of this repeat-rich form of human variation will become increasingly relevant to both basic and clinical human genetics.


Subjects
DNA , Tandem Repeat Sequences , Humans , Tandem Repeat Sequences/genetics , Genetic Epigenesis
7.
Nat Commun ; 14(1): 6746, 2023 Oct 24.
Article in English | MEDLINE | ID: mdl-37875492

ABSTRACT

De novo protein design methods can create proteins with folds not yet seen in nature. These methods largely focus on optimizing the compatibility between the designed sequence and the intended conformation, without explicit consideration of protein folding pathways. Deeply knotted proteins, whose topologies may introduce substantial barriers to folding, thus represent an interesting test case for protein design. Here we report our attempts to design proteins with trefoil (3₁) and pentafoil (5₁) knotted topologies. We extended previously described algorithms for tandem repeat protein design in order to construct deeply knotted backbones and matching designed repeat sequences (N = 3 repeats for the trefoil and N = 5 for the pentafoil). We confirmed the intended conformation for the trefoil design by X-ray crystallography, and we report here on this protein's structure, stability, and folding behaviour. The pentafoil design misfolded into an asymmetric structure (despite a 5-fold symmetric sequence); two of the four repeat-repeat units matched the designed backbone while the other two diverged to form local contacts, leading to a trefoil rather than pentafoil knotted topology. Our results also provide insights into the folding of knotted proteins.


Subjects
Protein Folding , Proteins , Protein Conformation , Proteins/genetics , Proteins/chemistry , Protein Domains , Tandem Repeat Sequences/genetics
8.
BMC Ecol Evol ; 23(1): 55, 2023 Sep 26.
Article in English | MEDLINE | ID: mdl-37749487

ABSTRACT

BACKGROUND: The sturgeon group has been economically significant worldwide due to caviar production. Sturgeons comprise 27 species worldwide. Mitogenome data could be used to infer genetic diversity and investigate the evolutionary history of sturgeons. Only a limited number of complete mitogenomes in this family have been sequenced. Here, we annotated the mitochondrial genome of Huso huso, which revealed new aspects of this species. RESULTS: In this species, the mitochondrial genome consisted of 13 protein-coding genes, 22 tRNAs, 2 rRNAs, and two non-coding regions, following the pattern of other vertebrates. In addition, H. huso had a pseudo-tRNA-Glu between ND6 and Cytb and a 52-nucleotide tandem repeat with two copies in 12S rRNA. This duplication event is probably related to strand slippage during replication, which could persist in the strand due to mispairing during replication. Furthermore, an 82 bp repeat sequence with three copies was observed in the D-loop control region, which is commonly seen in different species. Regulatory elements were also seen in the control region of the mitochondrial genome, which included termination sequences and conserved regulatory blocks. Genomic components showed the highest conservation in rRNA and tRNA, while protein-coding genes and non-coding regions had the highest divergence. The mitochondrial genome was phylogenetically analysed using 12 protein-coding genes. CONCLUSIONS: By sequencing H. huso, we identified a distinct genome organization relative to other species that has never been reported. In recent years, advances in sequencing have identified more genome rearrangements. However, it is an essential aspect of researching the evolution of the mitochondrial genome that needs to be recognized.


Subjects
Mitochondrial Genome , Animals , Mitochondrial Genome/genetics , Fishes/genetics , Tandem Repeat Sequences/genetics , Transfer RNA/genetics
9.
PLoS One ; 18(9): e0290890, 2023.
Article in English | MEDLINE | ID: mdl-37729217

ABSTRACT

Protein regions consisting of arrays of tandem repeats are known to bind other molecular partners, including nucleic acid molecules. Although the interactions between repeat proteins and DNA are already widely explored, studies characterising tandem repeat RNA-binding proteins are lacking. We performed a large-scale analysis of human proteins devoted to expanding the knowledge about tandem repeat proteins experimentally reported as RNA-binding molecules. This work is timely because of the release of a full set of accurate structural models for the human proteome amenable to repeat detection using structural methods. The main goal of our analysis was to build a comprehensive set of human RNA-binding proteins that contain repeats at the sequence or structure level. Our results showed that the combination of sequence and structural methods finds significantly more tandem repeat proteins than either method alone. We identified 219 tandem repeat proteins that bind RNA molecules and characterised the overlap between repeat regions and RNA-binding regions as a first step towards assessing their functional relationship. We observed differences in the characteristics of repeat regions predicted by sequence-based or structure-based methods in terms of their sequence composition, their functions and their protein domains.


Subjects
Knowledge , RNA-Binding Proteins , Humans , Structural Models , RNA-Binding Proteins/genetics , Tandem Repeat Sequences/genetics , RNA/genetics
10.
J Struct Biol ; 215(4): 108023, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37652396

ABSTRACT

Tandem Repeat Proteins (TRPs) are a class of proteins with repetitive amino acid sequences that have been studied extensively for over two decades. Different features at the level of sequence, structure, function and evolution have been attributed to them by various authors. And yet many of their salient features appear only when looking at specific subclasses of protein tandem repeats. Here, we attempt to rationalize the existing knowledge on TRPs by pointing out several dichotomies. The emerging picture is more nuanced than generally assumed and allows us to draw some boundaries of what is not a "proper" TRP. We conclude with an operational definition of a specific subset, which we have denominated STRPs (Structural Tandem Repeat Proteins), which separates a subclass of tandem repeats with distinctive features from several other less well-defined types of repeats. We believe that this definition will help researchers in the field to better characterize the biological meaning of this large yet largely understudied group of proteins.


Subjects
Proteins , Tandem Repeat Sequences , Proteins/genetics , Proteins/chemistry , Tandem Repeat Sequences/genetics , Amino Acid Sequence
11.
Nature ; 621(7978): 344-354, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37612512

ABSTRACT

The human Y chromosome has been notoriously difficult to sequence and assemble because of its complex repeat structure that includes long palindromes, tandem repeats and segmental duplications [1-3]. As a result, more than half of the Y chromosome is missing from the GRCh38 reference sequence and it remains the last human chromosome to be finished [4,5]. Here, the Telomere-to-Telomere (T2T) consortium presents the complete 62,460,029-base-pair sequence of a human Y chromosome from the HG002 genome (T2T-Y) that corrects multiple errors in GRCh38-Y and adds over 30 million base pairs of sequence to the reference, showing the complete ampliconic structures of gene families TSPY, DAZ and RBMY; 41 additional protein-coding genes, mostly from the TSPY family; and an alternating pattern of human satellite 1 and 3 blocks in the heterochromatic Yq12 region. We have combined T2T-Y with a previous assembly of the CHM13 genome [4] and mapped available population variation, clinical variants and functional genomics data to produce a complete and comprehensive reference sequence for all 24 human chromosomes.


Subjects
Human Y Chromosomes , Genomics , DNA Sequence Analysis , Humans , Base Sequence , Human Y Chromosomes/genetics , Satellite DNA/genetics , Genetic Variation/genetics , Population Genetics , Genomics/methods , Genomics/standards , Heterochromatin/genetics , Multigene Family/genetics , Reference Standards , Genomic Segmental Duplications/genetics , DNA Sequence Analysis/standards , Tandem Repeat Sequences/genetics , Telomere/genetics
12.
J Struct Biol ; 215(3): 108001, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37467824

ABSTRACT

Structured tandem repeat proteins (STRPs) are a specific kind of tandem repeat protein characterized by a modular and repetitive three-dimensional structure arrangement. The majority of STRPs adopt solenoid structures, but with the increasing availability of experimental structures and high-quality predicted structural models, more STRP folds can be characterized. Here, we describe "Box repeats", an overlooked STRP fold present in the DNA sliding clamp processivity factors, which has eluded classification although structural data has been available since the late 1990s. Each Box repeat is a βαβββ module of about 60 residues, which forms a class V "beads-on-a-string" type STRP. The number of repeats present in processivity factors is organism dependent. Monomers of PCNA proteins in both Archaea and Eukarya have 4 repeats, while the monomers of bacterial beta-sliding clamps have 6 repeats. This new repeat fold has been added to the RepeatsDB database, which now provides structural annotation for 66 Box repeat proteins belonging to different organisms, including viruses.


Subjects
Proteins , Tandem Repeat Sequences , Proteins/chemistry , Tandem Repeat Sequences/genetics , DNA/genetics
13.
Emerg Top Life Sci ; 7(3): 249-263, 2023 Dec 14.
Article in English | MEDLINE | ID: mdl-37401564

ABSTRACT

The human genome contains numerous genetic polymorphisms contributing to different health and disease outcomes. Tandem repeat (TR) loci are highly polymorphic yet under-investigated in large genomic studies, which has prompted research efforts to identify novel variations and gain a deeper understanding of their role in human biology and disease outcomes. We summarize the current understanding of TRs and their implications for human health and disease, including an overview of the challenges encountered when conducting TR analyses and potential solutions to overcome these challenges. By shedding light on these issues, this article aims to contribute to a better understanding of the impact of TRs on the development of new disease treatments.


Subjects
Brain Diseases , Tandem Repeat Sequences , Humans , Tandem Repeat Sequences/genetics , Human Genome , Genomics , Genetic Polymorphism , Brain Diseases/genetics
14.
Curr Microbiol ; 80(8): 255, 2023 Jun 25.
Article in English | MEDLINE | ID: mdl-37356021

ABSTRACT

Unlike environmental P. koreensis isolated from soil, which has been studied extensively for its role in promoting plant growth, pathogenic P. koreensis isolated from fish has rarely been reported. Therefore, we isolated and investigated the possible pathogen responsible for the diseased state of Tor tambroides. Herein, we report the morphological and biochemical characteristics, as well as the whole-genome sequence, of a newly identified P. koreensis strain. We assembled a high-quality draft genome of P. koreensis CM-01 with a contig N50 value of 233,601 bp and 99.5% BUSCO completeness. The genome assembly of P. koreensis CM-01 consists of 6,171,880 bp with a G+C content of 60.5%. Annotation of the genome identified 5538 protein-coding genes, 3 rRNA genes, and 54 tRNAs; no plasmids were found. Besides these, 39 interspersed repeats and 141 tandem repeat sequences, 6 prophages, 51 genomic islands, 94 insertion sequences, 4 clustered regularly interspaced short palindromic repeats, 5 antibiotic resistance genes, and 150 virulence genes were also predicted in the P. koreensis CM-01 genome. A culture-based approach showed that the CM-01 strain exhibited resistance against ampicillin, aztreonam, clindamycin, and cefoxitin, with a calculated multiple antibiotic resistance (MAR) index value of 0.4. In addition, the assembled CM-01 genome was successfully annotated against the Cluster of Orthologous Groups of proteins database, Gene Ontology database, and Kyoto Encyclopedia of Genes and Genomes pathway database. A comparative analysis of CM-01 with three representative strains of P. koreensis revealed that 92% of orthologous clusters were conserved among these four genomes, and only the CM-01 strain possesses unique elements related to pathogenicity and virulence. This study provides fundamental phenotypic and genomic information for the newly identified P. koreensis strain.


Subjects
Fishes , Pseudomonas , Whole Genome Sequencing , Animals , Microbial Drug Resistance/genetics , Fish Diseases/microbiology , Fishes/microbiology , Malaysia , Phylogeny , Prophages/genetics , Tandem Repeat Sequences/genetics , Virulence/genetics , Pseudomonas/classification , Pseudomonas/drug effects , Pseudomonas/genetics , Pseudomonas/isolation & purification , Bacterial Genome , Genotype , Phenotype
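
Note on the assembly statistics quoted in the abstract above (contig N50, G+C content, MAR index): these can be computed from their standard definitions as in the following minimal sketch. The function names and toy inputs are hypothetical. The abstract does not state how many antibiotics were tested; four resistances and an index of 0.4 would be consistent with ten, but that is an inference.

```python
def n50(contig_lengths):
    """Length L such that contigs of length >= L cover at least half the assembly."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("contig_lengths must not be empty")


def gc_content(seq):
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def mar_index(n_resistant, n_tested):
    """Multiple antibiotic resistance index: resistances divided by antibiotics tested."""
    return n_resistant / n_tested


if __name__ == "__main__":
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))   # toy contig lengths
    print(gc_content("ATGC"))              # toy sequence
    print(mar_index(4, 10))                # 10 tested is an assumption
```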
15.
Mol Biol Rep ; 50(6): 5137-5146, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37115485

ABSTRACT

BACKGROUND: Tandem repeats in the mitochondrial DNA control region are known in different animal taxa, including bat species of the family Vespertilionidae. The long R1-repeats in the bat ETAS-domain are often present in a variable copy number and may exhibit both inter-individual and intra-individual sequence diversity. The function of repeats in the control region is still unclear, but it has been shown that repetitive sequences in some animal groups (shrews, cats and sheep) may include parts of the ETAS1 and ETAS2 conserved blocks of mitochondrial DNA. METHODS AND RESULTS: Analysis of the control region sequences for 31 Myotis petax specimens allowed the identification of the inter-individual variability and clarification of the composition of the R1-repeats. The copy number of the R1-repeats varies from 4 to 7 in individuals. The specimens examined do not exhibit the size heteroplasmy previously described for Myotis species. Unusually short 30 bp R1-repeats have been detected in M. petax for the first time. Ten specimens from Amur Region and Primorsky Territory have one or two copies of these additional repeats. CONCLUSIONS: It was determined that the R1-repeats in the M. petax control region consist of parts of the ETAS1 and ETAS2 blocks. The origin of the additional repeats seems to be related to the 51 bp deletion in the central part of the R1-repeat unit and subsequent duplication. Comparison of repetitive sequences in the control region of closely related Myotis species identified the occurrence of incomplete repeats also resulting from short deletions, but distinct from the additional repeats of M. petax.


Subjects
Chiroptera , Animals , Sheep/genetics , Chiroptera/genetics , Nucleic Acid Repetitive Sequences/genetics , Mitochondria/genetics , Mitochondrial DNA/genetics , Tandem Repeat Sequences/genetics
16.
PLoS One ; 18(4): e0281228, 2023.
Article in English | MEDLINE | ID: mdl-37043448

ABSTRACT

Protein tandem repeats (TRs) are motifs comprised of near-identical contiguous sequence duplications. They are found in approximately 14% of all proteins and are implicated in diverse biological functions facilitating both structured and disordered protein-protein and protein-DNA interactions. These functionalities make protein TR domains an attractive component for the modular design of protein constructs. However, the repetitive nature of DNA sequences encoding TR motifs complicates their synthesis and mutagenesis by traditional molecular biology workflows commonly employed by protein engineers and synthetic biologists. To address this challenge, we developed a computational protocol to significantly reduce the complementarity of DNA sequences encoding TRs called TReSR (for Tandem Repeat DNA Sequence Redesign). The utility of TReSR was demonstrated by constructing a novel constitutive repressor synthesized by duplicating the LacI DNA binding domain into a single-chain TR construct by assembly PCR. Repressor function was evaluated by expression of a fluorescent reporter delivered on a single plasmid encoding a three-component genetic circuit. The successful application of TReSR to construct a novel TR-containing repressor with a DNA sequence that is amenable to PCR-based construction and manipulation will enable the incorporation of a wide range of TR-containing proteins for protein engineering and synthetic biology applications.


Subjects
Proteins , Tandem Repeat Sequences , Base Sequence , Proteins/chemistry , Tandem Repeat Sequences/genetics , Protein Engineering , Polymerase Chain Reaction
17.
Biosystems ; 226: 104869, 2023 Apr.
Article in English | MEDLINE | ID: mdl-36858110

ABSTRACT

The sequencing of eukaryotic genomes has shown that tandem repeats are abundant in their sequences. In addition to affecting some cellular processes, tandem repeats in the genome may be associated with specific diseases and have been the key to resolving criminal cases. Any tool developed for detecting tandem repeats must be accurate, fast, and useable in thousands of laboratories worldwide, including those with not very advanced computing capabilities. The proposed method, the Rapid Perfect Tandem Repeat Finder (RPTRF), minimizes the need for excess character comparison processing by indexing the input file and significantly helps to accelerate and prepare the output without artifacts by using an interval tree in the filtering section. The experiments demonstrated that the RPTRF is very fast in discovering all perfect tandem repeats of all categories of any genomic sequences. Although the detection of imperfect TRs is not the focus of the RPTRF, comparisons show that it even outperforms some other tools (in five selected gold standards) designed explicitly for this purpose. The implemented tool and how to use it are available on GitHub.


Subjects
Genomics , Tandem Repeat Sequences , Base Sequence , Tandem Repeat Sequences/genetics , DNA Sequence Analysis
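
Note on perfect tandem repeat detection, the task addressed in the abstract above: a naive scan can illustrate what a "perfect" tandem repeat is. This sketch is not the RPTRF method, which the abstract describes as relying on input indexing and an interval tree; it simply checks periodicity for each unit length. The function names and the `max_unit` and `min_copies` parameters are hypothetical.

```python
def is_primitive(unit):
    """True if `unit` is not itself a repetition of a shorter string."""
    return unit not in (unit + unit)[1:-1]


def find_perfect_tandem_repeats(seq, max_unit=6, min_copies=3):
    """Return (start, unit, copies) for exact head-to-tail repeats in `seq`."""
    seq = seq.upper()
    n = len(seq)
    hits = []
    for period in range(1, max_unit + 1):
        i = 0
        while i + period <= n:
            # Extend the window while the sequence stays periodic
            j = i + period
            while j < n and seq[j] == seq[j - period]:
                j += 1
            copies = (j - i) // period
            unit = seq[i:i + period]
            if copies >= min_copies and is_primitive(unit):
                hits.append((i, unit, copies))
            # Any start before j - period + 1 would end at the same mismatch
            i = j - period + 1
    return sorted(hits)


if __name__ == "__main__":
    for start, unit, copies in find_perfect_tandem_repeats("TTACACACACGGGGGT"):
        print(start, unit, copies)
```

The primitivity check keeps a run such as ACACACAC from being reported a second time with the unit ACAC. The scan takes time proportional to sequence length times `max_unit`, which is adequate for illustration but not for whole genomes.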
18.
Methods Mol Biol ; 2632: 147-159, 2023.
Article in English | MEDLINE | ID: mdl-36781727

ABSTRACT

Abnormal expansion or shortening of tandem repeats can cause a variety of genetic diseases. The use of long DNA reads has facilitated the analysis of disease-causing repeats in the human genome. Long read sequencers enable us to directly analyze repeat length and sequence content by covering whole repeats; they are therefore considered suitable for the analysis of long tandem repeats. Here, we describe an expanded repeat analysis using target sequencing data produced by the Oxford Nanopore Technologies (hereafter referred to as ONT) nanopore sequencer.


Subjects
High-Throughput Nucleotide Sequencing , Nanopores , Humans , Tandem Repeat Sequences/genetics , DNA Sequence Analysis , DNA/genetics
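
Note on measuring repeat length from a read that spans the whole repeat, as the abstract above describes: the idea can be sketched by locating the flanking sequences and measuring what lies between them. This is a simplified illustration, not the protocol from the chapter. It assumes exact flank matches, whereas real nanopore reads contain errors and would need alignment-based matching. The function names, flanks and read are hypothetical.

```python
def repeat_region(read, left_flank, right_flank):
    """Return the sequence between two flanks, or None if either flank is missing."""
    start = read.find(left_flank)
    if start == -1:
        return None
    start += len(left_flank)
    end = read.find(right_flank, start)
    if end == -1:
        return None
    return read[start:end]


def estimate_copies(region, motif):
    """Repeat length expressed in motif units."""
    return len(region) / len(motif)


def motif_purity(region, motif):
    """Fraction of the region covered by non-overlapping exact motif copies."""
    if not region:
        return 0.0
    return region.count(motif) * len(motif) / len(region)


if __name__ == "__main__":
    # Toy read: 5 CGG units, one AGG interruption, then 6 more CGG units
    read = "ACGTACGTTT" + "CGG" * 5 + "AGG" + "CGG" * 6 + "TTGACCATGA"
    region = repeat_region(read, "GTACGTTT", "TTGACC")
    print(estimate_copies(region, "CGG"), round(motif_purity(region, "CGG"), 3))
```

Reporting purity alongside the copy estimate gives a rough view of sequence content as well as length, since interruptions lower the purity without changing the length.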
19.
BMC Bioinformatics ; 24(1): 62, 2023 Feb 23.
Article in English | MEDLINE | ID: mdl-36823555

ABSTRACT

Internal tandem duplication (ITD) of the FMS-like tyrosine kinase (FLT3) gene is associated with poor clinical outcomes in patients with acute myeloid leukemia. Although recent methods for detecting FLT3-ITD from next-generation sequencing (NGS) data have replaced traditional ITD detection approaches such as conventional PCR or fragment analysis, their use in the clinical field is still limited and requires further information. Here, we introduce ITDetect, an efficient FLT3-ITD detection approach that uses NGS data. Our proposed method allows for more precise detection and provides more detailed information than existing in silico methods. Further, it enables FLT3-ITD detection from exome sequencing or targeted panel sequencing data, thereby improving its clinical application. We validated the performance of ITDetect using NGS-based and experimental ITD detection methods and successfully demonstrated that ITDetect provides the highest concordance with the experimental methods. The program and data underlying this study are available in a public repository.


Subjects
Acute Myeloid Leukemia , Vascular Endothelial Growth Factor Receptor-1 , Humans , Protein-Tyrosine Kinases/genetics , Tandem Repeat Sequences/genetics , Acute Myeloid Leukemia/genetics , High-Throughput Nucleotide Sequencing/methods , fms-Like Tyrosine Kinase 3/genetics , Mutation , Gene Duplication
20.
PLoS Biol ; 21(1): e3001980, 2023 Jan.
Article in English | MEDLINE | ID: mdl-36701369

ABSTRACT

Borgs are huge, linear extrachromosomal elements associated with anaerobic methane-oxidizing archaea. Striking features of Borg genomes are pervasive tandem direct repeat (TR) regions. Here, we present six new Borg genomes and investigate the characteristics of TRs in all ten complete Borg genomes. We find that TR regions are rapidly evolving, recently formed, arise independently, and are virtually absent in host Methanoperedens genomes. Flanking partial repeats and A-enriched character constrain the TR formation mechanism. TRs can be in intergenic regions, where they might serve as regulatory RNAs, or in open reading frames (ORFs). TRs in ORFs are under very strong selective pressure, leading to perfect amino acid TRs (aaTRs) that are commonly intrinsically disordered regions. Proteins with aaTRs are often extracellular or membrane proteins, and functionally similar or homologous proteins often have aaTRs composed of the same amino acids. We propose that Borg aaTR-proteins functionally diversify Methanoperedens and all TRs are crucial for specific Borg-host associations and possibly cospeciation.


Subjects
Archaea , Tandem Repeat Sequences , Archaea/genetics , Tandem Repeat Sequences/genetics , Proteins